Abstract: The goal of this thesis is to create a program for Windows that
takes a compiled iOS application and emulates it. However, only the appli- 
cation’s machine code is emulated, whereas system functionality originally 
provided by iOS is translated to an equivalent functionality available on 
Windows. Hence, the emulated application employs a user interface and 
behavior that feel native on the target platform. At compile time, custom 
machine code is generated that supports the translation at runtime. The 
thesis also describes iOS's internals that the emulator needs to imitate and
discusses different approaches to cross-platform development. 
Introduction


Nowadays, almost everybody owns a smartphone and hence all imaginable 
kinds of applications and games are developed for mobile devices. Dominant 
operating systems in smartphones are Android and iOS [CR18]. In order to 
reach as many users as possible, developers usually create their applications 
for both platforms. 

Many modern applications are developed for smartphones, and their desktop
counterparts either do not exist or are not actively developed. These
applications could benefit from being available on desktop computers, but
writing applications for different platforms requires a lot of work, so
developers often prefer some platforms over others, and when they do, they
usually prefer mobile platforms.

Universal Windows Platform (UWP) is a less-known platform for application
development [W+18]. It is discussed further later in this thesis.
Applications written for UWP can run on both Windows Mobile devices and
Windows 10 computers. Although Windows Mobile has almost zero market
share among operating systems powering today’s smartphones [CR18],
Windows is the leading operating system on desktop personal computers and
laptops [Sta19].


Thesis goals 


The goal of this thesis is to present a UWP application that can execute
applications originally written for iOS. Our application is an emulator, but 
it only emulates the target iOS application. Neither iOS libraries nor the 
whole iOS are emulated. 

Such an emulator needs to cover many operating system functions to be of
any use. Therefore, writing it entirely from scratch would be a
time-consuming task. Fortunately, there is a library containing translations
of many iOS functions into Windows application programming interfaces
(APIs) [Den15]. It is a Microsoft library called WinObjC, and it is discussed
in more detail in Section 2.2.1. By using this library, we can focus on more
interesting areas when implementing our application, e.g., emulation and
translation of system calls between different architectures.

Since our application is an emulator, it takes a compiled iOS application 
as an input. Hence, it can be used without having to recompile the applica- 
tion’s source code or even having the source code available. By employing our 
emulator, developers could reach more users while still writing applications 
for only a few platforms.


Related work 


Similar software exists also on other platforms. For Linux, there is a soft- 
ware layer called Wine which works almost exactly like our emulator [AJ94]. 
The difference is that Wine enables users to run classic Windows programs 
(i.e., binaries using Win32 APIs) on Linux operating systems.

Also for Linux, there is a software layer named Darling which is very sim- 
ilar to Wine [Dol17]. It is actually more related to our emulator than Wine 
since it can load and execute macOS binaries (which have the same format 
as iOS binaries). However, it is not an emulator because machine code of 
the binaries is executed directly. Furthermore, it is far from complete—only 
simple console applications can be run on top of it [Dol19]. 

Yet another similar project is called Anbox and it enables users to run An- 
droid applications on Linux systems in much the same way as the previously 
mentioned projects [Fel17]. Unlike Darling, Anbox contains an emulator for 
cases when an Android application needs to run native Advanced RISC Machine
(ARM) code (otherwise, Android applications are compiled into Java
bytecode and hence easily portable).

Cider is another project with a similar goal but a different implementation
approach [AVHA+14]. It brings iOS applications to Android devices. However,
instead of translating functions from iOS system libraries (as our emulator
and the other related projects mentioned above do), it embeds these
libraries in binary format (i.e., they contain their original machine code) and
does the translation one level lower, at the level of syscalls.

Translation at the syscall level is also described by Dreyfus who 
presents a compatibility layer that enables macOS applications to run on 
NetBSD. 


Layout of the thesis 


In Chapter 1, we describe iOS internals and concepts that are essential for
an iOS emulator. It covers the Mach-O binary format, the iOS dynamic
loader, and the fundamentals of Objective-C and its runtime.

Different approaches to cross-platform development are discussed and 
compared in Chapter 2. These include different kinds of emulation. We also
discuss binary-compatibility problems in that chapter. 
In Chapter 3, we present our implementation and problems we faced
along the way. 


1. iOS 


Apple’s operating system for mobile devices is called iOS. It runs on Apple’s 
products like iPhones, iPods and iPads. Applications written for iOS are 
distributed through Apple's AppStore [App19]. 

Since we present an iOS application emulator in this thesis, it can be 
useful to know more details about the way iOS applications are executed on 
the real system. Ideally, these applications should not know they are being
emulated, so the emulator needs to behave toward them as closely as possible
to the way iOS does when they are executed on an iPhone. Therefore, we
describe the relevant behavior of iOS in this chapter. In Chapter 3, we
describe how our emulator imitates that behavior.


1.1 iOS binaries 


On Apple’s systems, executable code is packed inside files in the Mach-O
format [Lev12]. Such files are called binaries. Equivalents of the Mach-O
binary format are Portable Executable (PE)/Common Object File Format
(COFF) on Windows and Executable and Linkable Format (ELF) on Linux.


Binaries are called executable if they have an entrypoint, i.e., a single 
place in the machine code where execution can begin. Such binaries are 
executed by the operating system in several possible ways, e.g., automati- 
cally when the operating system starts or manually by the user launching 
an application. 

Non-executable binaries are called libraries. Static libraries are used at 
compile time and therefore are not interesting for our emulator. Dynamic li- 
braries (also called shared libraries) are pieces of code used by an executable 
or other dynamic libraries. The same library can be used by multiple ex- 
ecutables and libraries. Dynamic libraries look almost always exactly like 
executables, but they usually do not have an entrypoint. On iOS, shared 
libraries have extension .dylib, whereas executables do not have any ex- 
tension. 

Apart from standard Mach-O binaries, there are also so-called fat ones. 
Fat binaries are simply wrappers around two or more standard Mach-O bi- 
naries. They are useful for distributing applications for different architec- 
tures. On iOS, they are commonly used to distribute machine code for 32-bit 
and 64-bit ARM processors to enable applications to run on both older and
newer iPhones. 


1.1.1 Mach-O file format 


Mach-O is a fairly standard binary format, not so different from its
counterparts used on other operating systems. It is mostly well documented,
too [App09b]. For a sample binary's representation in this format, refer to
Figure 1.1. Like many file formats, it starts with a so-called magic, a unique
sequence of bytes that determines the file’s type (whether it is a standard or
fat binary) and endianness. Then, there is a header describing the binary’s
structure.
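
The role of the magic can be illustrated with a small C sketch. The constant values are the ones published in Apple's headers `<mach-o/loader.h>` and `<mach-o/fat.h>`; the types `binary_kind` and `magic_info` and the function `classify_magic` are hypothetical names introduced only for this illustration, and the sketch assumes the first four bytes of the file were already read into a `uint32_t` in the reader's native byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Magic values from Apple's <mach-o/loader.h> and <mach-o/fat.h>. */
#define MH_MAGIC    0xfeedfaceu /* 32-bit Mach-O, same byte order as reader */
#define MH_CIGAM    0xcefaedfeu /* 32-bit Mach-O, opposite byte order */
#define MH_MAGIC_64 0xfeedfacfu /* 64-bit Mach-O, same byte order as reader */
#define MH_CIGAM_64 0xcffaedfeu /* 64-bit Mach-O, opposite byte order */
#define FAT_MAGIC   0xcafebabeu /* fat wrapper, same byte order as reader */
#define FAT_CIGAM   0xbebafecau /* fat wrapper, opposite byte order */

/* Hypothetical helper types, not part of any Apple API. */
typedef enum { KIND_UNKNOWN, KIND_THIN_32, KIND_THIN_64, KIND_FAT } binary_kind;

typedef struct {
    binary_kind kind;
    int swapped; /* nonzero if the file's byte order differs from the reader's */
} magic_info;

/* Classify a file by its first four bytes. */
static magic_info classify_magic(uint32_t magic) {
    magic_info info = { KIND_UNKNOWN, 0 };
    switch (magic) {
    case MH_MAGIC:    info.kind = KIND_THIN_32; break;
    case MH_CIGAM:    info.kind = KIND_THIN_32; info.swapped = 1; break;
    case MH_MAGIC_64: info.kind = KIND_THIN_64; break;
    case MH_CIGAM_64: info.kind = KIND_THIN_64; info.swapped = 1; break;
    case FAT_MAGIC:   info.kind = KIND_FAT; break;
    case FAT_CIGAM:   info.kind = KIND_FAT; info.swapped = 1; break;
    default: break; /* not a Mach-O file */
    }
    return info;
}
```

A loader built along these lines would call `classify_magic` first and only then decide whether to parse a fat header or a standard Mach-O header, byte-swapping every field when `swapped` is set.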

The Mach-O format introduces quite flexible load commands. They are read
by the dynamic loader when it loads the binary, hence their name. There are
several kinds of them, each with a specific meaning. They are stored in the
binary’s header.

Just as PE and ELF binaries have sections, Mach-O binaries have
both segments and sections. Segments are top-level constructs (described
by LC_SEGMENT load commands) and sections are contained inside them. By
convention, there is usually a segment __TEXT (with section __text)
containing code and a segment __DATA containing multiple sections with
constants, global variables, runtime data structures and other data.

Apart from load commands specifying segments of the binary, there are 
many more, e.g., LC_LOAD_DYLIB commands specifying dependent libraries,
command LC_SYMTAB pointing to the binary’s symbol table, etc. 
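
To make the layout concrete, the following C sketch walks the load commands of a 32-bit Mach-O image held in memory. The field order of `mach_header` and `load_command` and the command numbers follow Apple's `<mach-o/loader.h>` (with all fields simplified to `uint32_t`); the function `count_load_commands` is a hypothetical helper, and the sketch assumes the image uses the reader's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified 32-bit layouts following Apple's <mach-o/loader.h>. */
struct mach_header {
    uint32_t magic, cputype, cpusubtype, filetype;
    uint32_t ncmds;      /* number of load commands */
    uint32_t sizeofcmds; /* total size of all load commands in bytes */
    uint32_t flags;
};

struct load_command {
    uint32_t cmd;     /* kind of the command, e.g. LC_SEGMENT */
    uint32_t cmdsize; /* size of the whole command, including these fields */
};

#define LC_SEGMENT    0x1u
#define LC_SYMTAB     0x2u
#define LC_LOAD_DYLIB 0xcu

/* Hypothetical helper: count load commands of one kind.
   Returns -1 if the command list does not fit into the image. */
static int count_load_commands(const uint8_t *image, size_t size, uint32_t kind) {
    struct mach_header hdr;
    if (size < sizeof hdr) return -1;
    memcpy(&hdr, image, sizeof hdr);

    size_t offset = sizeof hdr; /* commands start right after the header */
    int count = 0;
    for (uint32_t i = 0; i < hdr.ncmds; i++) {
        struct load_command lc;
        if (offset > size || size - offset < sizeof lc) return -1;
        memcpy(&lc, image + offset, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > size - offset) return -1;
        if (lc.cmd == kind) count++;
        offset += lc.cmdsize; /* cmdsize tells us where the next command is */
    }
    return count;
}
```

Because every command carries its own size, a reader can skip kinds it does not understand, which is what makes the format flexible.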


1.1.2 .ipa file format 


The executable binary alone is not enough to constitute a full-fledged iOS
application. Usually, there are also third-party libraries the app uses, some
metadata and, last but not least, app icons [App16].

All these assets are packed in an iPhone application (IPA) file, i.e., a file
with the extension .ipa [iPh13]. An IPA package is distributed via Apple's
AppStore or by other means and contains everything the app needs in order to
run. IPA files are simply .zip packages with a standard directory structure.
The content of such an IPA file is the input of our emulator. Instructions for
getting an IPA file are provided in Appendix A.1.

The most important part of an IPA file for the emulator is the main
executable binary. It can be found simply by following a naming convention.
Imagine an iPhone application called MyApp. Its main executable
is MyApp.ipa/Payload/MyApp.app/MyApp. The working directory of this
executable is MyApp.ipa/Payload/MyApp.app/, effectively meaning that whenever
MyApp works with relative paths, they are relative to that directory.
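
The naming convention can be written down as a short C sketch. The function `main_executable_path` is a hypothetical helper, and it assumes that the bundle and the executable share the application's name as in the MyApp example; real bundles record the executable's name under the CFBundleExecutable key of Info.plist, which this sketch does not read.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<root>/Payload/<app>.app/<app>", where root is
   the directory into which the .ipa archive was extracted.
   Returns 0 on success and -1 if the result does not fit into out. */
static int main_executable_path(const char *root, const char *app,
                                char *out, size_t out_size) {
    int n = snprintf(out, out_size, "%s/Payload/%s.app/%s", root, app, app);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Dropping the last path component of the result yields the working directory described above.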
Figure 1.1: Structure of a typical Mach-O binary (simplified)


1.1.3 Execution of Mach-O binaries


When an iPhone application is launched, the system needs to load the binary
into memory, do some preparations and then execute it [App09a]. This is the
job of the so-called dynamic loader, which is the program dyld on iOS.¹


Loading segments. First, dyld maps segments into memory with their
respective permissions (e.g., the __TEXT segment is usually marked as
executable but not writable, whereas the __DATA segment will probably not be
marked as executable). Sections are not treated specially by the dynamic 
loader, they just logically partition segments (and user programs can use 
this partitioning however they want). 

Segments are always contiguous and mapped into memory as a whole.
Usually, all segments are also mapped into memory together, without any 
gaps between them. However, a segment in the binary can be smaller than 
its actual size when mapped into memory at runtime. For example, segment 
__PAGEZERO commonly used in executables has no contents specified in the 
binary (its file size is zero) but takes up a page of virtual memory at run- 
time. It is set up so that the executable can neither read nor write to this 
segment and it is usually mapped at address zero, effectively ensuring that 
NULL pointer dereferences fail predictably.² Note that since this segment
cannot be accessed, it does not take up any physical memory at runtime. 

See Figure 1.2 for two sample Mach-O binaries as they would appear
when loaded into memory at runtime. Note that Mach-O headers are also
loaded and available at runtime, usually inside the __TEXT segment.


Relocation. The previously described mapping of segments into virtual
memory is done for the main executable binary and for any dependent li- 
braries, as well. Obviously, all these binaries need to share the same address 
space, so they need to be loaded at different offsets for different applications. 
But library binaries are compiled once and may contain some absolute ad- 
dresses (effectively addresses relative to their beginnings). When loading 
these libraries, dyld must shift such absolute addresses, so that they point 
to the correct places in memory at runtime. This process (shifting addresses 
after loading the binary) is called relocation. 

There is a special segment called __LINKEDIT in Mach-O binaries where
the dynamic loader finds the list of addresses that need to be relocated [Lev13].
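
Relocation itself boils down to adding one constant to every listed pointer, as the following C sketch shows. It is a simplification under stated assumptions: the function `rebase_image` and its flat array of fixup offsets are hypothetical, whereas dyld reads the locations to relocate from an encoded form in __LINKEDIT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rebase an image that was linked for preferred_base
   but is actually loaded at the address of image. Every entry of fixups is
   the byte offset of one pointer-sized slot that holds an absolute address. */
static void rebase_image(uint8_t *image, const size_t *fixups, size_t count,
                         uintptr_t preferred_base) {
    /* The slide is the distance between the real and the expected address. */
    uintptr_t slide = (uintptr_t)image - preferred_base;
    for (size_t i = 0; i < count; i++) {
        uintptr_t value;
        memcpy(&value, image + fixups[i], sizeof value);
        value += slide; /* shift the stored address to its runtime location */
        memcpy(image + fixups[i], &value, sizeof value);
    }
}
```

After the call, a slot that used to hold `preferred_base + 16` points at byte 16 of the loaded image, wherever the image ended up in the address space.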


¹ More precisely, it is actually a library which first loads itself into the address space of the
target binary before loading the binary itself. This process is called bootstrapping [Ras12]. 

² And not only dereferences of NULL pointers, but also dereferences of small non-NULL
pointers which can occur due to invalid pointer arithmetic (e.g., subtraction of two pointers). 
Figure 1.2: Mach-O executable and its dependent library loaded at runtime
Binding. The next step is to load dependent libraries (and their
dependencies, recursively). They are listed in the header of a Mach-O binary
using LC_LOAD_DYLIB commands and can be both system and user libraries (the
latter are present inside the IPA package).

When all binaries are loaded and relocated, the dynamic loader pro- 
ceeds to the final step—it links them together. Essentially, runtime link- 
ing (also called binding) means connecting symbols one library needs (these 
are called imports) to symbols another library provides (these are called
exports). Again, the list of imports and exports is found in the aforementioned
segment __LINKEDIT.

Technically, the binary file contains just zeros in place of imported
addresses, and the task of the dynamic loader is to rewrite them with the actual
addresses of symbols exported from other libraries. Since this happens at 
runtime (these zeros are only rewritten in memory, not in the original bi- 
nary), the dynamic loader already knows where the target library (the one 
exporting the symbol) lies in memory, which is obviously necessary to deter- 
mine the symbol’s actual address. 
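
A minimal C sketch of this rewriting step follows. The types `export_entry` and `import_entry` and the function `bind_imports` are hypothetical; they stand in for the import and export lists of __LINKEDIT, whose real encoding is more compact, and the linear search by name is chosen only for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory views of the export and import lists. */
typedef struct { const char *name; void *address; } export_entry;
typedef struct { const char *name; void **slot; } import_entry;

/* Write the address of every matching export into the (so far zero-filled)
   slot of the corresponding import. Returns the number of unresolved imports. */
static size_t bind_imports(import_entry *imports, size_t n_imports,
                           const export_entry *exports, size_t n_exports) {
    size_t unresolved = 0;
    for (size_t i = 0; i < n_imports; i++) {
        void *found = NULL;
        for (size_t j = 0; j < n_exports && !found; j++)
            if (strcmp(imports[i].name, exports[j].name) == 0)
                found = exports[j].address;
        if (found)
            *imports[i].slot = found; /* patched in memory only */
        else
            unresolved++;
    }
    return unresolved;
}
```

The sketch only works once every exporting library has been loaded and relocated, because `address` must already be the symbol's final runtime address.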


Code execution. Finally, dyld starts executing the code of the loaded binary.
The execution starts at a location specified by load command LC_THREAD,
LC_UNIXTHREAD or LC_MAIN; this location is called an entrypoint. To start the
execution, dyld allocates a stack, points the stack pointer to it and calls the
entrypoint.

Apart from command line arguments (as is standard for C main functions),
iOS entrypoints also receive a list of environment variables. Their
signature then looks like this:


int main(int argc, char **argv, char **envp); 


1.2 Objective-C 


Nowadays, applications meant to run on iOS are written mostly in Swift or 
Objective-C [GEBL15]. Both of these languages are high-level ones 
and as such they need supporting runtime libraries. Since the emulator 
only supports apps that use Objective-C runtime, we do not consider Swift 
in the following discussion, but the principles generally hold for Swift and 
its runtime, as well. 




/* C++ */
class Bar {
public:
    int foo(int val) { return val == 42 ? 0 : 1; }
};

int main() {
    return Bar().foo(42);
}


/* Objective-C */
@interface Bar : NSObject
- (int)foo:(int)val;
@end

@implementation Bar
- (int)foo:(int)val { return val == 42 ? 0 : 1; }
@end

int main() {
    Bar *bar = [[Bar alloc] init];
    return [bar foo:42];
}

Figure 1.3: Object-oriented features of C++ and Objective-C 
1.2.1 Basic syntax


Objective-C adds object-oriented concepts to classic C while maintaining full
backwards compatibility [Vá13]. It is widely used in the development of
applications for Apple devices. To support object-oriented development, it
introduces several new keywords which are prefixed with the at sign (@).
Notably, there is a concept of classes which are declared using @interface and
defined using @implementation. Usually, declarations are located in .h files
(just like in classic C) and definitions in .m files.³

As is common in object-oriented languages, classes in Objective-C can 
have methods, both static and instance ones. Since Objective-C is derived 
from Smalltalk [GR83], it uses so-called message passing instead of method 
calling. If we call a method foo with parameter 42 on object bar in C++ as
bar.foo(42), we write [bar foo:42] in Objective-C for the same call.

Another difference apparent from Figure 1.3 arises when creating class
instances. In C++, it is possible to create instances in-place (i.e., on stack), 
whereas in Objective-C, we can create instances only on the heap. Methods 
alloc and init used to instantiate objects in Objective-C are implemented 
by class NSObject in the Objective-C runtime. That is also why NSObject is
usually the root ancestor of all user-defined Objective-C classes. 


1.2.2 Runtime library 


Message passing and method calling usually do the same thing, but the for- 
mer is much more dynamic. For example, it is possible to send arbitrary 
messages to arbitrary objects, even if these objects do not have the corre- 
sponding method to handle those messages. However, such messages can 
be handled at runtime by dynamically inserting a method to the receiving 
class. For all that to be possible, every method call in Objective-C
must go through the Objective-C runtime library.

This library is called libobjc.A.dylib on Apple's systems, but there
are also independent implementations of the Objective-C runtime elsewhere.
For example, an implementation of Objective-C from GNU exists (it is called
GNUstep) and both GNU Compiler Collection (GCC) and Clang
compilers can generate code specific for it. There is also an experimental
Objective-C runtime called Étoilé and although it has no real compiler
support, the paper in which it is presented [Chi09] contains some interesting
concepts and ideas for Objective-C runtime implementation.


³ There is also Objective-C++, a union of Objective-C and C++, containing object-oriented
concepts of both languages [App10]. Its files usually have the extension .mm.
Let us now consider Apple’s Objective-C runtime (the one implemented
in libobjc.A.dylib).⁴ When compiling code such as [bar foo:42], machine
code equivalent to the following C code is generated by the compiler:


int result = (int)objc_msgSend(bar, "foo:", 42);


Function objc_msgSend from libobjc.A.dylib ultimately finds the body of
method -[Bar foo:] and invokes it with parameter 42. But in order to
do that, it needs some information about class Bar, e.g., a list of its methods.
These metadata are generated by the compiler and are stored inside the
binary. For example, the list of all classes in a binary is stored inside section
__objc_classlist in segment __DATA.
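
The lookup can be imitated by a toy dispatcher in plain C. Everything prefixed with `toy_` is hypothetical and much simpler than Apple's runtime, which compares selectors by pointer, caches lookups and supports arbitrary signatures; the sketch only mirrors the idea that method bodies are found through per-class metadata and receive the receiver and the selector as their first two arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct toy_object toy_object;

/* Method body: receiver, selector and one int argument (an assumption made
   to keep the sketch small). */
typedef int (*toy_imp)(toy_object *self, const char *sel, int arg);

typedef struct { const char *selector; toy_imp imp; } toy_method;

typedef struct toy_class {
    const struct toy_class *superclass;
    const toy_method *methods; /* compiler-generated metadata in reality */
    size_t method_count;
} toy_class;

struct toy_object { const toy_class *isa; };

/* Search the receiver's class and its ancestors for the selector and invoke
   the body that is found. Returns -1 if no class handles the message. */
static int toy_msg_send(toy_object *self, const char *sel, int arg) {
    for (const toy_class *cls = self->isa; cls; cls = cls->superclass)
        for (size_t i = 0; i < cls->method_count; i++)
            if (strcmp(cls->methods[i].selector, sel) == 0)
                return cls->methods[i].imp(self, sel, arg);
    return -1;
}

/* Body of -[Bar foo:] from Figure 1.3 and the metadata describing Bar. */
static int bar_foo(toy_object *self, const char *sel, int val) {
    (void)self;
    (void)sel;
    return val == 42 ? 0 : 1;
}

static const toy_method bar_methods[] = { { "foo:", bar_foo } };
static const toy_class bar_class = { NULL, bar_methods, 1 };
```

With `toy_object bar = { &bar_class };`, the call `toy_msg_send(&bar, "foo:", 42)` plays the role of `[bar foo:42]`.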

Format of these metadata depends on the runtime used (that is also 
why every Objective-C runtime needs a special compiler support) and dif- 
fers even between two versions of Apple’s runtime (plain old Objective-C 
and Objective-C 2.0 [Pla06]).

Furthermore, the Objective-C runtime needs to cooperate closely with the
dynamic loader (discussed in Section 1.1.3), at least on Apple's systems. That
is because Apple's metadata require some preparation before they can be
used. One example of this is selectors, which are basically method names
(e.g., in [bar foo:42], "foo:" is the selector; note that it is the second
argument of objc_msgSend). In essence, they are just C strings (const char *),
but they have to be unique, i.e., the same selector must only exist once in
memory at runtime. And this has to hold across library boundaries, as well. So,
if there are two identical selectors in two different libraries, the Objective-C
runtime must replace one of them with a pointer to the other one at load time.⁵
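
Selector uniquing is essentially string interning, which the following C sketch demonstrates. The table, `unique_selector` and `fix_selector_refs` are hypothetical and use a fixed-size array with linear search for brevity; Apple's runtime offers the real operation as `sel_registerName`, and the array of references passed to `fix_selector_refs` plays the role of a library's selector references.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SELECTORS 64

/* Hypothetical global table of canonical selector strings. */
static const char *selector_table[MAX_SELECTORS];
static size_t selector_count;

/* Return the canonical pointer for name, registering name itself when it is
   seen for the first time. Returns NULL when the table is full. */
static const char *unique_selector(const char *name) {
    for (size_t i = 0; i < selector_count; i++)
        if (strcmp(selector_table[i], name) == 0)
            return selector_table[i];
    if (selector_count == MAX_SELECTORS)
        return NULL;
    selector_table[selector_count++] = name;
    return name;
}

/* Replace every selector reference of one library by the canonical pointer. */
static void fix_selector_refs(const char **refs, size_t count) {
    for (size_t i = 0; i < count; i++) {
        const char *canonical = unique_selector(refs[i]);
        if (canonical)
            refs[i] = canonical;
    }
}
```

Once all references are fixed, two selectors can be compared with a single pointer comparison instead of a string comparison.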


1.2.3 Frameworks and system libraries 


With the Objective-C runtime alone, there is not much iOS applications can do.
A typical iOS app needs to at least draw something on the display. For that
and other communication with the operating system, there exist so-called 
frameworks. They are shared libraries written in Objective-C, exposing iOS’s 
functionality in an object-oriented way. 

Frameworks are located inside /System/Library/Frameworks/ [App13].
There are also other system libraries on iOS, located in /usr/lib/. These 


⁴ It is open-source and the code is available at
https://opensource.apple.com/source/objc4/


⁵ There is a special section called __objc_selrefs containing pointers to selectors. These
pointers are actually referenced from the code instead of the actual selectors. That way, the 
runtime can replace only these pointers and the code automagically picks up the unique 
selectors everywhere. 
other libraries are classic libraries, exposing plain C functions (with the
notable exception being libobjc.A.dylib).
A few examples of iOS frameworks are listed below: 


• Fundamental Objective-C types are defined in CoreFoundation and
Foundation. These frameworks include numeric types, strings, collections,
etc. Types there usually have the prefix NS, e.g., NSInteger
(object-backed integer), NSString and NSArray.


• User interfaces can be created with the framework UIKit. It has types
like UIApplication representing the whole application and UIButton,
UITextField and similar for standard user interface (UI) controls.


• There are also more specific frameworks, e.g., MapKit, which developers
can use to embed maps in their applications, AudioToolbox for audio
recording and playback, etc. For more examples, see Apple’s documentation
[App15].
2. Cross-platform development


To reach as many users as possible with an application, it needs to support 
multiple platforms. Usually, potential users might want to use the appli- 
cation on their mobile phones, personal computers or even just in a web 
browser, without actually installing it. However, different development en- 
vironments and tools (and therefore also programming languages) are used 
to create applications for different platforms. Thus, to create multi-platform 
applications, developers often write the same application multiple times, 
each time producing different code for different platforms but ideally with 
the same behavior across all the platforms. 


Writing an application that runs on multiple platforms is called cross- 
platform development. To be more efficient and less error-prone, developers 
tend to share as much code as possible among the platforms they target. Ide- 
ally, all the code is shared and the resulting applications still feel like they 
are native to their respective platforms, i.e., they use facilities of the
operating system they run on, even ones that might not be available on other
operating systems. For example, Live Tiles on Windows and widgets on Android both
fit the same purpose but are used quite differently and thus sharing code for 
them might be difficult. 


Now consider the following situation. Let P be a piece of computer soft- 
ware (it might be an application, operating system, library, etc.). Suppose it 
is written for some specific computer system A. A can be a specific hardware 
setup in case P is an operating system, or it might be an operating system 
in case P is a user-space application, etc. 


Furthermore, let B be a new computer system which is incompatible with 
A. Incompatible here means that P cannot just be taken without modifica- 
tions and run on B like it runs on A. For example, let P be an operating 
system and A, B two different pieces of hardware, both having completely 
different processors with different instruction sets. Then operating system 
P compiled into machine instructions of A cannot be run by B which expects 
completely different instructions. 


To run P on B, there are several options. In general, a different program
P' needs to be created. P' should run on B but behave similarly to P on
A. In this chapter, we discuss some of the most common ways
to tackle the problem of creating P' from P. The topics discussed in this
chapter are summarized in Figure 2.1.

[Diagram of cross-platform development options; labels recoverable from the
figure: Cross-platform development; WinObjC; Full-system; Wine, Darling,
our emulator]
Figure 2.1: Cross-platform development options


2.1 Emulation 


We might simply want to execute P on B unchanged. In that case, no ac- 
tual transformation of P is necessary, but there still needs to be some extra 
layer between P and B that makes the execution possible and it is called an 
emulator [Tuc65]. 

The simplest emulator keeps an instance of A in memory, i.e., the part of
its state relevant to the emulation, e.g., the values of its
registers, the contents of its memory, etc. This is called
an emulation context.

The emulator simply interprets P’s instructions as the real A would 
while changing the emulation context appropriately. We say that the emu- 
lator emulates P. More advanced emulators do not just interpret, but rather 
translate P’s instructions to the instruction set of B and then execute the 
translated instructions directly, improving performance by doing so. 
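
The interplay of an emulation context and an interpreter can be shown on a toy machine. The three-instruction machine below is invented purely for illustration (it is not ARM or any real instruction set), and all names in the sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Instruction set of a hypothetical machine A. */
enum { OP_HALT = 0, OP_LOAD_IMM = 1, OP_ADD = 2 };

typedef struct {
    uint8_t op;  /* one of the OP_ constants */
    uint8_t dst; /* destination register index */
    uint8_t src; /* source register index (OP_ADD only) */
    int32_t imm; /* immediate value (OP_LOAD_IMM only) */
} toy_insn;

/* Emulation context: the state of A that the emulator keeps in memory. */
typedef struct {
    int32_t reg[4]; /* general-purpose registers of A */
    size_t pc;      /* index of the next instruction */
    int halted;
} toy_context;

/* Interpret instructions one by one, updating the context as A would. */
static void toy_run(toy_context *ctx, const toy_insn *program, size_t length) {
    while (!ctx->halted && ctx->pc < length) {
        const toy_insn *in = &program[ctx->pc++];
        switch (in->op) {
        case OP_LOAD_IMM:
            ctx->reg[in->dst & 3] = in->imm;
            break;
        case OP_ADD:
            ctx->reg[in->dst & 3] += ctx->reg[in->src & 3];
            break;
        default: /* OP_HALT or an unknown opcode stops the machine */
            ctx->halted = 1;
            break;
        }
    }
}
```

A translating emulator would replace the `switch` by code generation: instead of executing each case immediately, it would emit equivalent instructions of B and run them afterwards.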

Generally, an emulator can consist of several software and even hard- 
ware components. The object of emulation can also be either software or 
hardware or both. In this work, we only consider software emulators, i.e.,
sets of programs and tools that enable developers to make some piece of
software P run on some platform B without any special help from hardware.


2.1.1 Full emulation 


Now, let P be a compiled computer program that runs on operating system 
A and let B be another, incompatible operating system on which users would
also like to run P. One approach to this is so-called full emulation. In
full emulation, the whole operating system A is emulated on top of B. 

Full emulation is simple in theory. Let I_A denote the set of instructions that
A is compiled into. The corresponding emulator runs on system B, takes
operating system A as its input and interprets A's instructions from set
I_A exactly as the hardware of A would. The only thing this emulator needs to
understand is the set I_A. The emulated operating system can be provided
dynamically at runtime. However, the operating system is usually fixed, and
only programs P_i that are meant to run on the emulated OS are provided by
the user.

To be of any use, full emulation also needs to emulate some hardware
components, e.g., display, mouse and keyboard. This setup (emulated
operating system and hardware) is called a virtual machine [Gol74]. It is
possible to have multiple virtual machines on a single real machine. 

A disadvantage of this approach is the need to have a whole operating
system in memory on top of another operating system. That is obviously
memory-consuming and as such it is often not a reasonable approach on
mobile devices, which have limited resources.


2.1.2 User-space emulation 


Instead of full system emulation, where the whole A is emulated, only the 
target program P can be emulated. However, a typical P calls operating
system functions, so the emulator needs to make A’s functions available on
B. In these cases, we do not really care about emulation, as that is the easy 
part and might not even be necessary if A and B use the same instruction 
set. The most challenging part here is making sure that P does not know it 
runs on B and not on A. 

Wine does this by exposing Windows APIs and implementing them
via Linux API calls, effectively making it possible to run Windows programs on
Linux. Darling is similar: it exposes macOS APIs on Linux. Our
emulator does something similar as well, only in this case, the exposed APIs
reflect iOS system libraries and frameworks (see Section 1.2.3) and they are
translated to Windows API calls.

Translating API calls of one operating system to API calls of another one
gets more complicated if the two systems run on different architectures. That is
also our case, since we emulate iOS applications written for ARM processors
in Windows running on an i386 processor.

For example, consider the following function call that could appear inside 
an iOS application: 


UIApplicationMain(0, NULL, nil, @"AppDelegate");


The presented call of function UIApplicationMain from iOS framework
UIKit (see Section 1.2.3) takes four arguments (an int plus three pointers)
and returns an int. On 32-bit ARM, all these types are 32 bits wide, so the
calling convention is pretty simple in this case. Parameters are passed in
registers (r0, r1, r2 and r3) and the return value is also located in a register
upon return (r0 is reused for that).

Now imagine we have a port of this function to Windows. A port is a
function with the same signature as the original function and an
implementation that behaves exactly like the original function but has different
code since it must use the target operating system's APIs.

Since the port of UIApplicationMain would target Windows, it would 
be compiled into i386 machine code which has a different calling convention
than the original function. By that convention, a function with the signature 
of UIApplicationMain expects its arguments on the stack and returns its 
result in register EAX. 

So, in this particular case, the emulator could simply copy values from 
registers r0, r1, r2 and r3 of the emulation context to the stack of the host 
machine, jump to the port and then transfer the value of the host machine's 
EAX into the emulation context’s register r0 upon return. However, calling
conventions get much more complex once more advanced signatures are
involved [ARM15].
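
For the simple four-argument case, the transfer can be sketched in C as follows. The names `arm_context`, `port4_fn`, `call_port4` and `sample_port` are hypothetical, and the sketch sidesteps the host calling convention by using an ordinary C call, so the host compiler places the arguments wherever its convention requires (on the stack in the i386 case described above). It also assumes that guest and host pointers have the same width, which would not hold on a 64-bit host.

```c
#include <assert.h>
#include <stdint.h>

/* Part of an emulation context: registers r0 to r15 of the emulated ARM CPU. */
typedef struct { uint32_t r[16]; } arm_context;

/* Host function taking four 32-bit arguments and returning a 32-bit value,
   i.e., the shape of a port of UIApplicationMain. */
typedef uint32_t (*port4_fn)(uint32_t, uint32_t, uint32_t, uint32_t);

/* Read the arguments from r0-r3, call the port natively and store the
   result back into r0, where the emulated caller expects it. */
static void call_port4(arm_context *ctx, port4_fn port) {
    ctx->r[0] = port(ctx->r[0], ctx->r[1], ctx->r[2], ctx->r[3]);
}

/* Hypothetical stand-in for a ported function, used only for demonstration. */
static uint32_t sample_port(uint32_t a, uint32_t b, uint32_t c, uint32_t d) {
    return a + 2 * b + 3 * c + 4 * d;
}
```

Signatures with more than four arguments, 64-bit values, floating-point values or structures need additional rules, which is where the complexity mentioned above comes from.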

Translating API calls between different architectures is similar to a
technique known as remote procedure call (RPC). It also involves calling
functions across potentially different architectures, although the calls are
usually transferred across different machines, as well [BN84]. RPC
communication is also used between microkernel servers [Fes07], e.g., in L4
Runtime Environment [L4R18]. However, library authors must have RPC in mind
when designing their APIs and so must library users when consuming those
APIs. Therefore, existing RPC libraries are not suitable for translating calls
into and out of emulated binaries.

Simplified Wrapper and Interface Generator (SWIG) is a tool that
generates bindings between C++ code and common scripting languages [Bea96].
Hence, SWIG understands standard calling conventions very well and has
the potential to generate an API translation layer for a user-space emulator.
Unfortunately, it is only designed to work between C++ code on one end and 
a scripting language on the other end; it cannot effectively translate calls 
between two compiled binaries. 


2.2 Compiling for multiple platforms 


Instead of emulating machine code of the compiled program P, its devel- 
oper can compile it for different platforms. If P is written in a high-level 
programming language, this can be as simple as re-invoking the compiler 
(or not doing anything at all if the language is interpreted or compiled into 
byte-code). But if the source code of P depends heavily on internals of platform
A, then porting it to platform B (i.e., changing the source code so that it can
be compiled for B) can be a rather tedious task.

Luckily, there are cross-platform tools that try to minimize the amount 
of porting necessary and maximize code sharing among platforms. For ex- 
ample, Xamarin enables developers to write mobile applications in one lan- 
guage (C#) and run them on multiple platforms (Windows, iOS and An- 
droid) [Dic13]. It also contains some abstractions over common operating 
system functions, so that they can be used on all target platforms in the 
same way, hence maximizing code sharing among the platforms. 


2.2.1 WinObjC 


Another cross-platform tool, WinObjC, takes a different approach. It enables 
developers to take sources of existing iOS applications and re-compile them 
for Windows [Den15]. Essentially, it consists of iOS frameworks and other 
system libraries (see Section 1.2.3) ported to Windows. The original iOS code
(written in Objective-C) is compiled using Clang along with the ported iOS
libraries into a Windows application. 

WinObjC libraries are written for UWP. Apart from classic Win32 APIs
that contain only flat C functions, UWP apps also use the new Windows
Runtime (WinRT) APIs [W+18]. These APIs are designed to be used from several
programming languages. Internally, they are built on the Component Object
Model (COM), but language projections exist on top of them to expose them in
a way natural to each supported programming language.

To target UWP from C++, as WinObjC also does, a language projection
called C++/WinRT can be used [W+19]. It exposes the WinRT APIs in an
object-oriented way as standard C++ classes; e.g., it contains UI-related
classes like Window, Button and TextBox.

Recall from Section 1.2.3 that iOS frameworks do something similar, only
instead of C++, they use Objective-C. And indeed, there is often a
correspondence between WinRT APIs and iOS APIs, which is leveraged by
WinObjC. For example, iOS's class UIButton is implemented in WinObjC using
WinRT's class Button.


A disadvantage of using cross-platform development tools is that
developers are forced to use special tools and compile their programs for
multiple platforms. Typical iOS developers create applications in Xcode on
Mac, and if they wanted to port these applications to Windows using WinObjC,
they would also need to own a Windows machine with Visual Studio. Not only
can this get resource-consuming, but it also requires the developer to learn
to use completely different tools.

Microsoft, the author of WinObjC, uses Clang instead of its own compiler,
MSVC, since MSVC can neither parse nor compile Objective-C code.

// iOS definitions (not real)
struct ComplexStructure;
void change(ComplexStructure *CS);

// User function
void func() {
  // ...
  ComplexStructure CS;
  // ...
  change(&CS);
  // ...
}

Figure 2.2: Illustration of binary-compatibility problems


2.3 Combining it all together 


This disadvantage can be overcome by using user-space emulation. The
emulator could take the original iOS binary unchanged, yet it would only
emulate code of the binary itself and call WinObjC functions instead of
emulating iOS.

Many iOS functions are not ported in WinObjC yet. So, let us
assume that P is an iOS application that could be ported by WinObjC, i.e., it
only uses functions from WinObjC that are implemented correctly.

Then, all functions that P calls must behave in WinObjC exactly as they
do on iOS. Consider P's function func from Figure 2.2. It calls the iOS
function change, which acts in some way on structure ComplexStructure. Since
change works correctly by our assumption, the code will work when compiled
for the target architecture. But ComplexStructure might be aligned
differently on different platforms, effectively meaning that it might be
represented differently in memory. So, the code might not work correctly
when emulated, since the WinObjC library containing change might expect
ComplexStructure to have a different layout than it has in P.
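
The layout mismatch can be illustrated by a minimal sketch. It is not code from the emulator: both structures are hypothetical, and `#pragma pack` is used only as an assumption to simulate two compilers that apply different alignment rules to the same declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Layout as a natively compiled library might see it (natural alignment).
struct LibraryView {
  std::uint8_t Flag;
  std::uint32_t Value;
};

// Layout as the emulated binary might see it. Packing is used here only
// to simulate a compiler with different alignment rules.
#pragma pack(push, 1)
struct GuestView {
  std::uint8_t Flag;
  std::uint32_t Value;
};
#pragma pack(pop)

// True if both sides agree on the size and on where Value is stored.
inline bool layoutsAgree() {
  return sizeof(LibraryView) == sizeof(GuestView) &&
         offsetof(LibraryView, Value) == offsetof(GuestView, Value);
}

// Offset at which the library would look for Value.
inline std::size_t offsetSeenByLibrary() { return offsetof(LibraryView, Value); }

// Offset at which the guest actually stored Value.
inline std::size_t offsetUsedByGuest() { return offsetof(GuestView, Value); }
```

If the library read `Value` at `offsetSeenByLibrary()` from memory that the guest filled according to `GuestView`, it would read the wrong bytes, even though the source code of both sides is identical.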

Note that problems of this kind arise from how the compiler works. We
call these problems binary-compatibility issues. Furthermore, note that the
only difference between the working and the non-working func is that the
former is compiled differently than the latter. Therefore, problems with
binary compatibility are the only ones that can arise, provided all WinObjC
functions are working correctly.

Another problem in WinObjC is its usage of an Objective-C metadata format
that differs from the format iOS applications use. Note that Objective-C
metadata are generated by the compiler, so this problem does not appear
when P is recompiled but does appear when P is emulated; hence, it is also a
binary-compatibility issue.

However, all these problems can be solved by compiling WinObjC li- 
braries so that they are binary-compatible with iOS. Therefore, the de- 
scribed approach can work. We implemented it and details about our im- 
plementation are provided in the next chapter. 
3. Implementation and results


Because of the disadvantages of both compiling for multiple platforms
(discussed in Section 2.2) and full emulation, we decided to create a
user-space emulator (see Section 2.1.2). It runs on Windows, takes iOS
applications, emulates them, and whenever they call some iOS function, it
translates the call to an equivalent Windows functionality.

Our emulator works completely dynamically, i.e., once it is compiled, it
can take any iOS application and emulate it as long as translations of all
iOS API functions that the app needs are implemented. Even better, if some
iOS function is not implemented yet, the app can still be emulated as long
as it does not call that function, even if the function is imported by the
application (but perhaps only called in some unlikely situations).


3.1 Overview 


As already mentioned in Chapter 1, our emulator tries to imitate real iOS's
behavior as closely as possible. Our implementation of this behavior in the 
emulator is described throughout this chapter. 

The emulator depends on several components as illustrated in Figure 3.1.
At compile time, lots of support code is generated to be then used at
runtime by our emulator (see Section 3.7.1). Furthermore, the compiler used
throughout the project contains some important patches that we also
consider to be part of the project's implementation; these are described in
Section 3.6.

At runtime, our emulator relies on multiple components in order to work.
These components and their relations are sketched in Figure 3.2. A dynamic
loader discussed in Section 3.2 loads binaries into memory. Guest binaries
are emulated by an emulator described in Section 3.3. Native libraries
include the Objective-C runtime (see Section 3.4) and WinObjC libraries (see
Section 3.5). Supporting wrappers, which are generated at compile time, are
both .dylibs and .dlls. Calls between the emulation and the native libraries
go through a system translator. They usually go through the wrappers; that
is a static translation, described in Section 3.7.1. However, they can also
bypass the wrappers, and then a dynamic translation must occur, as discussed
in Section 3.7.2.

Note that the runtime part of the emulator is divided into two parts. The
main functionality is contained in library IpaSimLibrary. The executable
part is a UWP application (see Section 3.5 for more details about this
decision) called IpaSimApp. This division enables us to build most of the
project's code with one tool and build only the final executable with a
different one (as UWP applications are not supported by the former tool);
see Appendix A.1 for more information about these tools.

Figure 3.1: Components of our emulator and dependencies between them

Figure 3.2: Organization of our emulator at runtime


3.2 Dynamic loader 


Input to the emulator is an .ipa package (described in Section 1.1.2), which
is provided by the user. The first thing the emulator does is find the main
executable binary in the package, load it and execute it. Loading binaries
is the job of class DynamicLoader. It can load Mach-O binaries (.dylibs and
executables) but also keep track of loaded dynamic-link libraries (DLLs).


The latter feature is useful especially for so-called Mach-O posers, i.e.,
DLLs that have an embedded Mach-O header. For more information about
them, refer to Section 3.6.3. To load Mach-O binaries, our dynamic loader
does all the steps described in Section 1.1.3, i.e., it loads their
segments, relocates them and binds external symbols.

The dynamic loader also registers Mach-O binaries within our Objective-C
runtime, which then does some initialization on them. That is something
the original dyld does on iOS, too, as described in Section 1.2.2.

Before relocating and binding, just after loading all segments, the dy- 
namic loader also loads any dependent libraries. It must happen at this 
time, since the loader needs to know the exact location of all dependencies 
before binding external symbols from them. Note that mutual dependen- 
cies (i.e., when binary A depends on B, but also B depends on A) do not 
break the loader, either. That is simply because all binaries are in memory 
(and therefore their symbols' addresses are known) before any binding takes 
place. 
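
The loading order described above can be sketched as follows. This is a simplified model, not the actual DynamicLoader: the `Binary` structure, the integer "addresses" and all names are assumptions made for illustration. The point is only that a binary is recorded as being in memory before its dependencies are visited, so a dependency cycle terminates and every import can still be bound.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified model of a binary on disk.
struct Binary {
  std::vector<std::string> Dependencies;                    // library names
  std::map<std::string, int> Exports;                       // symbol -> address
  std::vector<std::pair<std::string, std::string>> Imports; // (library, symbol)
};

class Loader {
public:
  explicit Loader(std::map<std::string, Binary> Files)
      : Disk(std::move(Files)) {}

  void load(const std::string &Name) {
    // "Load segments": from now on the binary counts as being in memory.
    if (!InMemory.insert(Name).second)
      return; // already loaded, which also breaks dependency cycles
    const Binary &Bin = Disk.at(Name);

    // Load dependencies before binding anything from them.
    for (const std::string &Dep : Bin.Dependencies)
      load(Dep);

    // Bind: every library this binary imports from is in memory by now.
    for (const auto &Import : Bin.Imports) {
      assert(InMemory.count(Import.first) && "dependency must be loaded");
      Bound[Name][Import.second] =
          Disk.at(Import.first).Exports.at(Import.second);
    }
  }

  // Address to which Symbol was bound inside the binary called Who.
  int resolved(const std::string &Who, const std::string &Symbol) const {
    return Bound.at(Who).at(Symbol);
  }

private:
  std::map<std::string, Binary> Disk;
  std::set<std::string> InMemory;
  std::map<std::string, std::map<std::string, int>> Bound;
};
```

With two binaries A and B that import one symbol from each other, calling `load("A")` loads B while A is already marked as being in memory, binds B against A, and only then binds A against B.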


3.3 Emulation setup 


When the dynamic loader loads the main executable and its dependencies
into memory, our emulator starts emulating the executable's code from its
entry point. For the core emulation functionality, library Unicorn is used.
It is based on the well-known open-source emulator QEMU [Wei18], the
difference being that Unicorn can also emulate separate pieces of machine
code, whereas QEMU is more of a full-system emulator [Q+15]. The emulator
communicates with Unicorn via a very simple API: it can read and write
registers, map memory with specific permissions and listen to events
happening during the emulation (this is called hooking).

The dynamic loader first maps memory within Unicorn with the follow- 
ing permissions: 


User binaries: Binaries containing code that is emulated are mapped as
requested by their Mach-O headers. This corresponds to the behavior
of iOS's dyld.

System libraries: Libraries that are executed natively on the target
platform are also mapped into Unicorn, but the memory is marked as
non-executable. That way, the emulated code can read and write global
variables and other data inside system libraries but cannot directly
jump to their code (that would not even make sense since that code is
written for a different architecture). Instead, our emulator listens to
these events (Unicorn trying to execute non-executable memory), and
that is when the guest-to-host transitions happen (see Section 3.7 for
more information).


There is also another important trick deployed in the emulator. When 
mapping memory into Unicorn, it is possible to specify a virtual address at 
which the memory will be visible inside the emulation. Our emulator maps 
all the pieces of memory at their real addresses, so that data structures 
are valid across the host-guest boundary (especially pointers to other data 
structures in memory). 
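
The two ideas from this section, identity mapping and trapping execution of non-executable memory, can be modeled without Unicorn. The class below is a hypothetical stand-in written only for illustration; in Unicorn's C API, the corresponding facilities are, as far as we know, `uc_mem_map_ptr` and a hook of type `UC_HOOK_MEM_FETCH_PROT`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

enum Perm : unsigned { Read = 1, Write = 2, Exec = 4 };

// Hypothetical model of the guest's view of memory.
class GuestAddressSpace {
public:
  // Called whenever the guest tries to execute non-executable memory.
  std::function<void(std::uintptr_t)> OnFetchFromNonExecutable;

  // Makes host memory visible to the guest at the very same address, so
  // pointers stored in data structures stay valid on both sides.
  void mapIdentity(const void *Host, std::size_t Size, unsigned Perms) {
    Regions.push_back({reinterpret_cast<std::uintptr_t>(Host), Size, Perms});
  }

  // Simulates an instruction fetch; returns true if the guest may proceed.
  bool fetch(std::uintptr_t Addr) const {
    for (const Region &R : Regions) {
      if (Addr < R.Begin || Addr - R.Begin >= R.Size)
        continue;
      if (R.Perms & Exec)
        return true;
      if (OnFetchFromNonExecutable)
        OnFetchFromNonExecutable(Addr); // guest-to-host transition point
      return false;
    }
    return false; // unmapped address
  }

private:
  struct Region {
    std::uintptr_t Begin;
    std::size_t Size;
    unsigned Perms;
  };
  std::vector<Region> Regions;
};
```

A user binary would be mapped with `Read | Exec` (or whatever its Mach-O header requests), whereas a system library would be mapped with `Read | Write` only, so that a jump into it invokes the callback instead of being executed.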


3.4 Library objc 


The main executable binary could not do much without system libraries it 
depends on. These libraries provide the application access to system's re- 
sources (generally input and output, e.g., keyboard and display). One such 
library that all apps written in Objective-C need is a runtime library. As 
we may recall from Section [1.2.2] this library uses metadata embedded in 
binaries and these metadata have a specific binary format which no existing 
Objective-C runtime library implementation for Windows uses. Hence, we 
decided to compile Apple's Objective-C runtime library (libobjc.A.dylib) 
for Windows. We call it simply objc. 

An alternative approach to this problem would be to simply take Apple's
library and emulate it as well. But as we will see later in Section 3.5,
WinObjC frameworks also link to an Objective-C runtime library, and then
our objc will come in handy.

Most of libobjc.A.dylib is written in C++, so porting it just means
replacing iOS function calls with Windows ones. Fortunately, the runtime
does not depend very much on the operating system's functionality. But it
uses threading a lot, which in older versions of C++ and on iOS means using
Portable Operating System Interface (POSIX) threads, a.k.a. pthreads. Our
objc uses pthreads-win32, which is a library providing pthreads APIs on
Windows [Joh12].

Furthermore, the message-passing functionality of libobjc.A.dylib is
written in assembly code. When building our objc, these pieces of assembly
code are first compiled into Mach-O object files and then converted to
COFF object files using utility objconv [Fog18].
3.5 WinObjC libraries


All remaining operating system functionality that iOS applications depend
on is contained inside iOS frameworks and other system libraries (see
Section 1.2.3). To have iOS frameworks available on Windows, the emulator
uses WinObjC libraries (see Section 2.2.1).

WinObjC libraries use WinRT APIs and are written in Objective-C++
(to expose iOS-like APIs in Objective-C and use WinRT APIs in C++ at the
same time) and thus use the C++/WinRT language projection. Furthermore,
WinRT APIs are in their entirety only available for UWP applications. Hence,
our emulator must be a UWP application in order to be compatible with
WinObjC.

Original WinObjC libraries use the GNUstep Objective-C runtime. As already
mentioned in Section 1.2.2, the GNUstep runtime uses its own metadata
format. However, our runtime objc does not understand that format.

Thus, the emulator needs to use WinObjC not only linked against our
objc, but also compiled so that the correct metadata format is generated.
Fortunately, compiler Clang does exactly that when given command-line
option -fobjc-runtime=ios-11, even on Windows.

Originally, WinObjC also defined class NSObject (root class of almost all 
other Objective-C classes), since GNUstep runtime does not have its own. 
But our objc has its own NSObject, hence we also patched WinObjC so 
that it uses objc’s NSObject. Furthermore, WinObjC libraries originally 
used their own pthreads implementation and, to be unified with objc, we 
patched them to use pthreads-win32, as well. 


3.6 Clang modifications 


Quite surprisingly, Clang can generate COFF object files (recall from
Section 1.1 that these are only used on Windows) that are compatible with
Apple's Objective-C runtime. For example, Clang recognizes the
Windows-specific dllimport attribute¹ on classes even in parts of the
compiler that are used only when compiling for Apple's Objective-C runtime.²

Unfortunately, this implementation is not complete, and we needed to
patch Clang in order for it to generate COFF object files fully working with
an Apple-compatible Objective-C runtime (in our case, library objc). We use
the patched Clang to build our emulator and libraries the emulator uses
(e.g., WinObjC frameworks). In this section, we describe some of these
patches.

¹ We call it "dllimport attribute" for short, but to actually use it in
code, we would have to write the more long-winded __declspec(dllimport) in a
class or function declaration.

² See method CGObjCNonFragileABIMac::GetClassGlobal in src/deps/clang/lib/
CodeGen/CGObjCMac.cpp.


3.6.1 Extending support for attribute dllimport 


On Windows, the dllimport attribute is used to mark functions and global
variables that reside in another shared library (a .dll file). There is a
section in the binary which is filled at runtime by the dynamic loader with
the actual addresses of all imported symbols. The compiled code uses these
addresses to access the imported variables. Hence, the compiler needs to
know which variables are dllimport-ed, since they are accessed indirectly.
To prevent accessing them directly, as is done with variables residing in
the same binary, their names are prefixed with __imp_.

Clang recognizes the dllimport attribute on a class and saves
information about its presence in an object representing the global variable
for the class. However, it then fails to use this information when actually
emitting the global variable into the output file. Our patched Clang emits
dllimport-ed variables correctly (it simply adds the __imp_ prefix); the
patch can be seen in method TargetMachine::getSymbol in source file src/
deps/llvm/lib/Target/TargetMachine.cpp.
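
The indirection behind dllimport can be modeled in portable C++. The sketch below is an illustration only and does not reproduce the Clang patch: `referencedSymbol`, `SharedCounter` and `ImportSlotForSharedCounter` are hypothetical names, and the "import slot" is an ordinary pointer variable standing in for an entry of the import section.

```cpp
#include <cassert>
#include <string>

// Name under which the compiled code refers to a global variable:
// imported variables are reached through a slot named with the prefix.
std::string referencedSymbol(const std::string &Name, bool IsDllImport) {
  return IsDllImport ? "__imp_" + Name : Name;
}

// A variable that, in this sketch, lives "in another DLL".
int SharedCounter = 41;

// Stand-in for one slot of the import section; null until the loader runs.
int *ImportSlotForSharedCounter = nullptr;

// What the dynamic loader does at load time: fill in the real address.
void fillImportSlots() { ImportSlotForSharedCounter = &SharedCounter; }

// What code compiled against a dllimport-ed variable does: one extra
// indirection through the slot instead of a direct access.
int readSharedCounter() {
  assert(ImportSlotForSharedCounter && "loader has not run yet");
  return *ImportSlotForSharedCounter;
}
```

If the compiler forgot the prefix, the generated code would refer to the variable directly, and the reference could not be resolved within the binary being compiled.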


3.6.2 Section  fixbind 


Translating the capabilities of iOS's dynamic loader into the Windows
environment creates a challenge when encountering external symbol references
in .dlls compiled from Objective-C code. For example, suppose there is a
class A in A.dylib which is a subclass of class B from B.dylib. Then there
would be a pointer from A's metadata to B's metadata to denote the
inheritance. On iOS, the dynamic loader can bind external symbols wherever
the library wants them, so this would work correctly. But on Windows,
imported symbols can only be in a contiguous list, as mentioned before.

Thus, the patched Clang collects all those binding targets that are lo- 
cated outside the standard import list. Pointers to them are emitted into 
section fixbind. Our dynamic loader makes sure they are bound correctly 
at runtime when the binary is loaded. 
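
A possible shape of such fix-ups is sketched below. The record format is an assumption made for illustration (the text above does not specify how entries of the section are encoded): each hypothetical `FixBind` record names a field to patch and the import-list entry that already holds the real address.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for Objective-C class metadata.
struct ClassMetadata {
  const char *Name;
  ClassMetadata *Superclass;
};

// Hypothetical fix-up record: Target is a pointer-sized field somewhere in
// the binary's data; ImportSlot is the import-list entry that the Windows
// loader has already filled with the real address.
struct FixBind {
  ClassMetadata **Target;
  ClassMetadata *const *ImportSlot;
};

// Run by the dynamic loader after the standard imports have been bound.
inline void applyFixBinds(const std::vector<FixBind> &Fixes) {
  for (const FixBind &Fix : Fixes)
    *Fix.Target = *Fix.ImportSlot;
}
```

In the example with classes A and B, the superclass field of A's metadata would be the target, and the import-list entry for B's metadata would be the source of the address.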


3.6.3 Generating Mach-O headers 


Another change to the patched Clang is about the Mach-O file format (see
Section 1.1.1), especially Mach-O headers. Apple's Objective-C runtime (and
therefore also library objc) uses Mach-O headers when initializing binaries
(see Section 1.2.2 for more details about this initialization process).
Mostly, it uses the header to search for sections inside the binary. For
example, section __objc_classlist contains the list of classes in the binary
(it is part of the metadata generated by the compiler). The runtime uses
that list to initialize Objective-C classes, e.g., to call their static
constructors.

Our patched Clang (or, to be more precise, Clang's linker, lld) contains
code that generates Mach-O headers even for PE binaries (i.e., .dll files);
we call such libraries Mach-O posers. Thanks to that, Objective-C classes of
WinObjC frameworks can be initialized correctly by our objc.
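
Searching a header for a section can be sketched as follows. The `Section` record is a simplified assumption that keeps only a few fields of a real Mach-O section entry; what it preserves is that names are fixed 16-byte fields, so a 16-character name such as __objc_classlist is stored without a terminating zero and must be compared with a bounded comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified section record with fixed-size, possibly unterminated names.
struct Section {
  char SectName[16];
  char SegName[16];
  std::uint32_t Addr;
  std::uint32_t Size;
};

// Copies at most 16 characters of Src into a zero-filled name field.
inline void copyName(char (&Dst)[16], const char *Src) {
  std::memcpy(Dst, Src, std::min(std::strlen(Src), sizeof(Dst)));
}

inline Section makeSection(const char *Seg, const char *Sect,
                           std::uint32_t Addr, std::uint32_t Size) {
  Section S = {};
  copyName(S.SegName, Seg);
  copyName(S.SectName, Sect);
  S.Addr = Addr;
  S.Size = Size;
  return S;
}

// Returns the first section with the given segment and section name,
// or nullptr if the header does not contain one.
inline const Section *findSection(const std::vector<Section> &Header,
                                  const char *Seg, const char *Sect) {
  for (const Section &S : Header)
    if (std::strncmp(S.SegName, Seg, sizeof(S.SegName)) == 0 &&
        std::strncmp(S.SectName, Sect, sizeof(S.SectName)) == 0)
      return &S;
  return nullptr;
}
```

A runtime would then interpret the bytes at the returned address as an array of pointers to class metadata; that step is omitted here.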


3.7 API translation 


Our emulator now has at hand a set of libraries running natively on
Windows providing Windows functionality via iOS-compatible APIs. We call
these our system libraries. There are also libraries and an executable that
come from the input IPA package; we call these user libraries and a user
executable, respectively. The only thing that remains is connecting our
system libraries with the user executable and user libraries.

The emulator is compiled for and runs on a host machine. The input
user binaries are compiled for a guest machine and the emulator interprets
the instruction set of the guest machine in order to execute the user
binaries. For simplicity, we will restrict ourselves to i386 Windows hosts
and ARM version 7 (ARMv7) guests³ in the following discussion, but the
principles can be easily generalized to other platforms as well.

So, our system libraries run natively on the host machine and the user
binaries run on an emulated guest machine. Our emulator must be able to
go from emulating the user binaries to executing our system libraries and
back. Note that the boundary between system and user libraries is well
defined beforehand since the system libraries are completely independent
of the user binaries. That also means that the guest-to-host and
host-to-guest transitions only happen when calling well-known functions
(these are either functions from iOS APIs or user callbacks that might be
called by our system libraries). Therefore, the emulator can determine the
signature of the function being called at each transition and translate the
call as described in Section 2.1.2.

³ On i386 processors, calling conventions depend on the operating system,
whereas on ARM processors, they do not. Therefore, we did not need to specify
the operating system of the guest platform, because we are only concerned
with calling conventions in this section.
3.7.1 Static translation


To deal with different calling conventions of the host and the guest when
translating function calls, our emulator uses Clang again, this time as a
library. At compile time, special wrapper functions are generated that
transfer function arguments across the guest-to-host boundary. This is the
job of a tool called HeadersAnalyzer,⁴ which runs at compile time, producing
outputs that our emulator uses at runtime.


API analysis 


To do its job, the analyzer needs to acquire a list of all functions whose
calls the emulator might wish to translate. It simply reads .tbd files from
the iOS SDK, which are YAML documents listing exports of iOS system
libraries. To append signature information to the list of exported
functions (needed later when generating wrappers), it also analyzes relevant
C++ headers from the same SDK.

This analysis is also done using the Clang library. The headers are
compiled by Clang into LLVM intermediate representation (IR). This
representation has the necessary type information that our analyzer needs
later in the process, when generating wrappers. Unfortunately, simply
compiling headers to LLVM IR discards any unused declarations, but the
analyzer needs them (after all, it is only concerned with declarations).
Thus, we also patched Clang to emit all declarations into the IR if it is
told to do so.

Next, the analyzer inspects our system libraries (i.e., WinObjC
frameworks and our objc). To every function found in Apple's libraries in
the previous step, one of our system .dlls that implements that function is
assigned. This mapping is useful when generating wrappers.


Code generation 


As a final step, the analyzer generates wrappers, two for every function that
the emulator might want to translate. Figure 3.3 shows an example of
wrappers generated for a sample function.

Wrappers of the first kind are generated for the guest architecture. To- 
gether, they are packed into .dylibs corresponding to .dylibs that are 
present on the real iOS. They are then simply loaded by the dynamic loader 
as any other .dylib and this way, iOS system functions are runtime-linked 
to user binaries. 


⁴ It used to only analyze C++ headers when it was created, hence its name.
It still does that, but also much more.
// Signature of the real iOS function and also of
// the implementation inside some system library
// (e.g., a WinObjC framework)
int foo(int a, float b);

// The iOS, ARMv7 wrapper (in a .dylib)
int foo(int a, float b) {
  struct {
    int *a;
    float *b;
    int retval;
  } s;
  s.a = &a;
  s.b = &b;
  $__ipaSim_wrapper_foo(&s);
  return s.retval;
}

// The Windows, i386 wrapper (in a .dll)
void $__ipaSim_wrapper_foo(void *args) {
  struct {
    int *a;
    float *b;
    int retval;
  } *argsp = (decltype(argsp))args;
  // Call the actual implementation.
  argsp->retval = foo(*argsp->a, *argsp->b);
}


Figure 3.3: Two wrappers generated for function foo 




Calling conventions can get quite complex and it would be difficult to 
extract the arguments manually. Instead, the .dylib wrappers take their 
arguments and store them in the memory in a known structure. The wrap- 
pers are compiled with Clang which knows calling conventions best. 

From a wrapper of the first kind, corresponding wrapper of the second 
kind is called. These wrappers are generated for the host architecture and 
are packed into .dlls corresponding to our system libraries. The dynamic 
loader loads these like other .dlls, i.e., as non-executable memory. It also 
runtime-links them to the wrappers of the first kind (wrappers of the first 
kind import functions $__ipaSim_wrapper_* and wrappers of the second 
kind export them—so, the dynamic loader links them normally). 

Our system libraries also export data symbols (e.g., global variables). 
These are simply re-exported from the wrapper .dylibs, so that user bina- 
ries that link to our wrappers get the actual data from our system libraries. 


Calling wrappers 


When Unicorn reaches a call of a $__ipaSim_wrapper_* function, it tries to
fetch from non-executable memory. Our emulator catches this event and
translates the call. This is simple since there is only one argument (the
structure pointer), and so it is quite easy to determine the calling
convention of that call. Our emulator extracts this pointer according to
that calling convention and simply calls the .dll wrapper, passing it the
pointer as its only argument. Translating calls is the job of our emulator's
class SysTranslator.

As can be seen in Figure 3.3, the wrapper of the second kind extracts
arguments from the known structure and calls the real .dll function. Again,
Clang's knowledge of calling conventions is used here.
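
The transition itself can be sketched in a few lines. This is a simplified model rather than the actual SysTranslator: `GuestRegs`, `Translator` and the way wrappers are registered are assumptions made for illustration, and the wrapper is named without the leading $__ to keep the sketch portable. It relies on two facts stated earlier: the structure pointer is the only argument (passed in r0 on ARM), and memory is mapped at the same addresses on both sides.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Minimal stand-in for the guest's register file (ARM: the first argument
// is passed in r0 and the return address is kept in lr).
struct GuestRegs {
  std::uintptr_t R0 = 0, LR = 0, PC = 0;
};

// Host-side implementation and its wrapper, mirroring Figure 3.3.
inline int foo(int a, float b) { return a + static_cast<int>(b); }
struct FooArgs {
  int *a;
  float *b;
  int retval;
};
inline void ipaSim_wrapper_foo(void *args) {
  FooArgs *argsp = static_cast<FooArgs *>(args);
  argsp->retval = foo(*argsp->a, *argsp->b);
}

class Translator {
public:
  using Wrapper = void (*)(void *);

  void registerWrapper(std::uintptr_t Addr, Wrapper W) { Wrappers[Addr] = W; }

  // Invoked when the guest tries to execute non-executable memory.
  // Returns false if the address is not a known wrapper.
  bool onFetchProt(GuestRegs &Regs) const {
    auto It = Wrappers.find(Regs.PC);
    if (It == Wrappers.end())
      return false;
    // Memory is mapped at identical addresses, so the guest pointer in r0
    // can be used directly on the host.
    It->second(reinterpret_cast<void *>(Regs.R0));
    Regs.PC = Regs.LR; // resume emulation at the caller
    return true;
  }

private:
  std::map<std::uintptr_t, Wrapper> Wrappers;
};
```

Note that the translator never needs to know the signature of foo; only the compiled wrappers on both sides do.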

However, not all functions from WinObjC libraries are exported in the
standard way. Objective-C methods are accessible only through the
metadata; they are not present in the library's list of exported functions.
So, it is possible for the emulated code to jump directly to a native
function (e.g., a method whose address it retrieved from the metadata),
skipping the wrappers. Thus, every wrapper .dll contains a map from all
possible functions (not only exported ones, but methods as well) to the
corresponding wrapper .dylibs. That way, the emulator can use a wrapper to
transfer the function's arguments even when the code tries to bypass it.


3.7.2 Dynamic translation 


Some methods are not picked up by HeadersAnalyzer, e.g., methods added
dynamically at runtime. To translate them, our emulator also implements
dynamic translation (as opposed to static translation via wrappers). This
dynamic translation can handle only simple function signatures and works
only for Objective-C methods. That is because their signatures can be found
in the metadata at runtime. The dynamic translation is implemented in
class DynamicCaller.
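
Signatures in Objective-C metadata are stored as type-encoding strings, e.g., "v16@0:8" for a method that returns void and takes only the implicit self and _cmd arguments. The following sketch shows how such a string could be checked for being "simple". It is an illustration, not the code of DynamicCaller; in particular, which type codes count as simple is an assumption made here.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct SimpleSignature {
  char Ret = 'v';
  std::vector<char> Args; // includes the implicit self ('@') and _cmd (':')
};

// Parses an Objective-C method type encoding such as "i16@0:8i12".
// Returns false for anything beyond plain scalar and pointer codes
// (structs, arrays, qualifiers), which this sketch treats as complex.
inline bool parseSimpleSignature(const std::string &Enc,
                                 SimpleSignature &Out) {
  static const std::string SimpleCodes = "vcislqCISLQfdB@#:*";
  Out = SimpleSignature();
  bool SeenReturn = false;
  for (std::size_t I = 0; I < Enc.size();) {
    char Code = Enc[I++];
    if (SimpleCodes.find(Code) == std::string::npos)
      return false;
    // Skip the number (frame size or argument offset) after each type.
    while (I < Enc.size() && std::isdigit(static_cast<unsigned char>(Enc[I])))
      ++I;
    if (!SeenReturn) {
      Out.Ret = Code;
      SeenReturn = true;
    } else {
      Out.Args.push_back(Code);
    }
  }
  return SeenReturn;
}
```

A dynamic caller could use the resulting list of codes to decide which guest registers hold which arguments, and fall back to reporting an error when the function returns false.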

So far, we have only solved guest-to-host transitions, but what about
host-to-guest transitions, i.e., when a .dll wants to call back into the
emulated code? Since these callbacks are rare and use only simple types, no
wrappers are generated for them; all the translations are done dynamically,
at runtime, without the help of Clang. There are actually two different ways
callbacks can be handled:


Trampolines: If a pointer to the emulated function is needed by a .dll
function, a trampoline is generated. Trampolines are small functions
that are generated at runtime in a chunk of executable memory.
Library libffi is used to generate them.


Every trampoline remembers address of the function it wraps; there- 
fore, when the trampoline is called, it can transfer its arguments into 
the emulator and jump to that address to actually execute the emu- 
lated function. Remember that callbacks have only simple signatures, 
so the calling convention and hence the translation is quite easy here. 


Direct calls: If the native code only calls the emulated function and does not
need to store its pointer anywhere, a different approach is used. The 
actual transition is similar to the previous case, but this time, it is im- 
plemented in templated methods in class DynamicBackCaller. Tem- 
plate arguments contain types of the callback’s parameters, so that for 
distinct callback signatures, distinct specializations are used. 


These specializations are called from our system libraries whenever 
they want to invoke some callback through a pointer. The pointer is 
passed as the first argument and the rest are arguments for the call- 
back. The emulator checks whether the pointer points into the emu- 
lated code—then the described translation is used—or outside of it— 
then the callback is called normally. 
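
The direct-call approach can be sketched with a function template. The code below is not the actual DynamicBackCaller: `GuestRange`, `FakeEmulator` and `callBack` are hypothetical names, and the emulator is replaced by a stub that only counts how many times a host-to-guest transition would have happened.

```cpp
#include <cassert>
#include <cstdint>

// Address range occupied by emulated (guest) code.
struct GuestRange {
  std::uintptr_t Begin, End;
  bool contains(const void *P) const {
    std::uintptr_t A = reinterpret_cast<std::uintptr_t>(P);
    return A >= Begin && A < End;
  }
};

// Stand-in for the emulator; a real one would place the arguments into
// guest registers and run the guest code.
struct FakeEmulator {
  int Calls = 0;
  template <typename R, typename... Args> R run(void *, Args...) {
    ++Calls;
    return R();
  }
};

// One specialization exists per distinct callback signature. The pointer
// comes first, followed by the arguments for the callback.
template <typename R, typename... Args>
R callBack(FakeEmulator &Emu, const GuestRange &Guest, void *Fn, Args... A) {
  if (Guest.contains(Fn))
    return Emu.run<R>(Fn, A...); // host-to-guest transition
  return reinterpret_cast<R (*)(Args...)>(Fn)(A...); // ordinary native call
}

// A native callback used to exercise the second branch.
inline int add(int X, int Y) { return X + Y; }
```

A system library would call, for instance, `callBack<int, int, int>(Emu, Guest, Ptr, 2, 3)` instead of invoking `Ptr` directly, without having to know whether `Ptr` points into emulated or native code.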


3.8 Evaluation


In this section, we evaluate our implementation in terms of completeness
and performance. All binaries used in this section can be built by following
instructions provided in Appendix A.1.
A crucial part of our emulator, HeadersAnalyzer (see Section 3.7.1), tries
to generate wrappers for as many iOS system functions as possible. Symbols
whose signature cannot be determined are treated as data symbols rather
than functions. Some of them are really data, others are just unrecognized
functions. Furthermore, variadic functions and Objective-C messengers⁵
need to be handled individually and HeadersAnalyzer only processes some
of them. Figure 3.4 summarizes the results of HeadersAnalyzer.

The emulator itself (i.e., program IpaSimApp) is complete enough to
run simple iOS applications. Instructions for running the program and
building iOS applications suitable as the program's inputs are laid out in
Appendix A.1. A few sample iOS applications are available in Attachment B.1.
One of these sample applications being emulated is shown in Figure 3.5.

To evaluate performance, we have created an iOS application that mea- 
sures performance of multiple areas in the emulator. It is one of the sample 
applications and it is called IpasimBenchmark. All the measurements were 
performed on a machine with processor Intel® Core™ i7-8550U. 

The benchmark application was run in our emulator and on the iPhone 
Simulator in Xcode. Running an application on the iPhone Simulator should 
be like running it on any real iPhone, at least for purposes of this evaluation. 

Several functions and Objective-C methods are run by the benchmark 
and time of their execution is measured. Every method is actually run 
twenty thousand times and average execution time is reported. The loop 
performing the executions and measurements is inside a system library 
(dispatch), so it is not emulated but always executed natively. 
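
The measurement procedure can be sketched as follows. This is not the code of the benchmark application, which is an iOS application that performs its loop inside a system library; `averageMillis` is a hypothetical helper showing only the principle of running a function many times and reporting the average.

```cpp
#include <cassert>
#include <chrono>

// Runs Fn the given number of times and returns the average duration of
// one run in milliseconds.
template <typename F>
double averageMillis(F &&Fn, int Iterations = 20000) {
  assert(Iterations > 0);
  const auto Start = std::chrono::steady_clock::now();
  for (int I = 0; I < Iterations; ++I)
    Fn();
  const std::chrono::duration<double, std::milli> Total =
      std::chrono::steady_clock::now() - Start;
  return Total.count() / Iterations;
}
```

A monotonic clock is used so that adjustments of the system time during the run cannot distort the result.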

Table shows results of the benchmark. Function object_isClass 
is from the Objective-C runtime library and its implementation is trivial. 
Since it is a native function, there is no overhead caused by our emula- 
tor. Another native function—objc_getClass—has also no emulator-caused 
overhead, but its implementation is more complex, so it takes more time to 
execute on the iPhone Simulator. 

User function staticNoop is defined inside IpasimBenchmark and does
absolutely nothing. Because it is defined inside the user code, our emulator
has to call it dynamically (as described in Section 3.7.2), and so there is a
small overhead. Method -[noop] does nothing as well, but this time it
is called through the Objective-C runtime, so there is an overhead of that.
Since it involves calling function objc_msgSend, which is in our emulator
handled in a way that involves multiple transitions between the host and
the guest, it has several times bigger overhead on Windows than on the


⁵ Objective-C messengers are functions similar to function objc_msgSend
(described in an earlier section).
[Graph labels: Not found, Found; Data, Function; Unhandled, Handled;
Vararg, Msg]

Figure 3.4: Different kinds of symbols analyzed by HeadersAnalyzer. The
first graph from the top shows the distribution of all symbols found in iOS
headers. These were either found or not found in one of our system .dlls. Of
the symbols that were found, the second graph shows the distribution of
functions and data among them. The third graph shows the distribution of
handled functions (i.e., functions whose wrappers were generated
successfully) and unhandled ones. A function can be unhandled because it is
variadic (Vararg) or an Objective-C messenger (Msg); the distribution of
these functions is shown in the last graph.
[Screenshots: the emulator's logging window shows a sequence of "Info: loading library ..." messages for the loaded libraries (among them gen\Foundation.wrapper.dll, gen\UIKit.wrapper.dll, libobjc.dll, ucrtbase.dll, Starboard.dll, libdispatch.dll, CoreFoundation.dll, and CFNetwork.dll); the iPhone Simulator window is titled "iPhone 8 Plus - iOS 11.1".]


Figure 3.5: Screenshots of a sample application. In the top left corner is 
our emulator displaying the emulated application in one window. In the top 
right corner is our emulator displaying logging messages in another window. 
Below is the same application running on the iPhone Simulator. 
Function or method             Windows    iPhone Simulator

object_isClass                    0 ms                1 ms
objc_getClass                     0 ms               41 ms
staticNoop                       18 ms                0 ms
-[noop]                         107 ms                6 ms
-[noSyscalls]                    86 ms               58 ms
-[viewWillLayoutSubviews]      1180 ms              478 ms


Table 3.1: Average execution time of selected functions in milliseconds. 


iPhone Simulator. 

Method -[noSyscalls] is also user-defined, but it contains some
non-trivial code. It does not call any system functions, i.e., our
emulator handles no guest-to-host transitions while the method runs.
Nevertheless, it still must be called from dispatch, which is a native
library, so there is an overhead of one host-to-guest transition. On the
other hand, method -[viewWillLayoutSubviews] contains almost nothing apart
from calls into system libraries, so there is an overhead of many
transitions between the host and the guest.
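
To put the numbers from Table 3.1 side by side, the following sketch computes how many times slower each entry is on Windows than on the iPhone Simulator. The values are copied from the table; the helper name slowdown is an assumption of this sketch, and entries reported as 0 ms on the Simulator yield no ratio.

```python
# Times from Table 3.1 in milliseconds: (Windows, iPhone Simulator).
RESULTS = {
    "object_isClass": (0, 1),
    "objc_getClass": (0, 41),
    "staticNoop": (18, 0),
    "-[noop]": (107, 6),
    "-[noSyscalls]": (86, 58),
    "-[viewWillLayoutSubviews]": (1180, 478),
}


def slowdown(windows_ms, simulator_ms):
    """Return Windows time divided by Simulator time, or None if undefined."""
    if simulator_ms == 0:
        return None
    return windows_ms / simulator_ms


if __name__ == "__main__":
    for name, (windows_ms, simulator_ms) in RESULTS.items():
        ratio = slowdown(windows_ms, simulator_ms)
        text = "n/a" if ratio is None else f"{ratio:.1f}x"
        print(f"{name:<28}{text}")
```

With these values, -[noop] comes out roughly 17.8 times slower, -[noSyscalls] about 1.5 times, and -[viewWillLayoutSubviews] about 2.5 times, which matches the explanation that the cost is driven by the number of transitions rather than by the emulated code itself.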
Conclusion


In this thesis, we presented an iOS emulator for Windows. It can run on
desktop computers that have Windows 10 installed. When launched, it asks
the user for a compiled iOS application, which is then emulated. Note that
the sources of the iOS application are not needed; a binary similar to the
one distributed through Apple's App Store is enough for the emulator to work.
Hence, developers do not have to employ any complex workflow to have their
applications supported by our emulator.

The emulation itself is done at the user-space level. This means that
neither the iOS kernel nor the system libraries are emulated. Instead, our
emulator emulates only the code of the iOS application, and calls to iOS
functions are translated to Windows calls. In effect, iOS functionality and
UI elements are translated into equivalent functionality in Windows if
available, making the emulated application feel like a native Windows
application.

Unfortunately, writing a complete translation of the standard iOS library
is equivalent to replicating a large portion of the effort Apple put into its
development, which greatly surpasses the scope of this thesis. As a result,
only very simple applications are supported by our emulator. However, the
concepts presented in this thesis are the crucial part of the emulator;
translating API calls is a mostly straightforward but extremely
time-consuming task.


Future work 
Translating more APIs requires work in several areas:


* More functions need to be covered by our compile-time tool, which
generates code that transfers function arguments between the host and
the guest platform. Currently, some Objective-C methods are not even
discovered and hence not analyzed any further. They could be discovered
by analyzing more Objective-C metadata. Such methods can still be
translated dynamically by our emulator, but only the simple ones (as
explained in Section 3.7.2). Furthermore, some special kinds of functions,
e.g., variadic functions, are not supported at all.


* The translation layer needs to implement more functionality. Our
emulator uses WinObjC libraries as this layer, and they currently contain
lots of iOS functions that are implemented only partially or not at all.
Since iOS and Windows 10 provide similar functionality, implementing the
translations should not involve much complexity. However, iOS APIs
contain many functions and hence translating them all would take a
considerable amount of time.


* All callbacks in our system libraries need to be handled. Since
callbacks can point to emulated code, they also need translation. However,
to translate them, the source code of WinObjC and other libraries must be
patched at all points where callbacks are called. These points can
only be found by manual inspection of the code.


* Recall that the Objective-C runtime needed patching because of the
binary-compatibility issues described in Section 2.3. Similarly, the
threading library dispatch needs proper patching, since it is binary
incompatible with the dispatch library that the WinObjC libraries use.
Currently, it is patched only partially so that at least some
functionality is supported.
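
The callback handling mentioned in the list above can be pictured with a small sketch. It is a simplified Python model, not the emulator's implementation: the class Emulator, its guest_range, and the method call_callback are hypothetical names, and addresses are plain integers. The only idea it illustrates is that a native library has to check where a callback points before calling it.

```python
class Emulator:
    """Toy model of an emulator that owns a range of guest (emulated) addresses."""

    def __init__(self, guest_start, guest_end):
        self.guest_range = range(guest_start, guest_end)
        self.native_functions = {}  # address -> Python callable (host code)
        self.emulated_calls = []    # record of guest addresses "executed"

    def is_guest_address(self, address):
        return address in self.guest_range

    def run_guest(self, address, *args):
        # A real emulator would marshal the arguments and execute guest
        # machine code here; this sketch only records the transition.
        self.emulated_calls.append((address, args))
        return None

    def call_callback(self, address, *args):
        """Call a callback that may point to either host or guest code."""
        if self.is_guest_address(address):
            return self.run_guest(address, *args)      # host-to-guest transition
        return self.native_functions[address](*args)   # plain native call


if __name__ == "__main__":
    emu = Emulator(0x1000, 0x2000)
    emu.native_functions[0x9000] = lambda x: x + 1
    print(emu.call_callback(0x9000, 41))  # callback implemented in host code
    emu.call_callback(0x1234, "event")    # callback pointing into emulated code
    print(emu.emulated_calls)
```

In this model, every place in a native library that invokes a callback would have to go through something like call_callback, which is why the call sites need to be located and patched one by one.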


Besides translating more APIs, there are also other areas suitable for
future work on the emulator:


* Since the emulator is written for UWP, it has the potential to run on
various Windows devices, e.g., phones, tablets and even gaming consoles.
Currently, the emulator is only compiled for architecture i386.
To support more devices, the emulator could also be compiled for
architecture ARM and for 64-bit processors. In some places, the source code
would have to be changed to support different architectures, notably
in our compile-time tool which generates code (it would need to generate
code for the other architectures, too).


* Only applications written in Objective-C are currently supported.
However, Apple introduced the Swift language as a replacement
for Objective-C. Therefore, in order to also support modern applications
written in Swift, a Windows port of the Swift runtime should be included
in our emulator.


* The user interface of the emulator is minimal. After all main syscalls
are implemented, it would be viable to improve the user interface to make
running the emulated applications simpler and more seamless.
A. Appendices


A.1 Installation 


Building the emulator is a multi-step process. In this section, we describe all 
steps necessary to get the emulator up and running. Files used throughout 
this section are available in Attachment B.1. All paths in this section are
relative to the root folder of that attachment unless stated otherwise. 


Getting prerequisites The source code of the whole project depends on
many third-party libraries, which are often built from source if they had
to be patched. Because of these libraries and their diverse requirements,
the development environment needs to be carefully prepared in order
to build the source code correctly. To ease the preparation and also to
formally define the development environment, we use Docker [Doc19].

Docker allows us to specify the prerequisites for compiling the code and
also a precise way of acquiring them, so that the resulting environment can
be recreated on multiple machines while still containing precisely what our
program needs to be built. The environment is defined in files
src/Dockerfile and src/docker-compose.yml.

Hence, there are only two prerequisites that need to be installed manually.
One of them is Docker Desktop for Windows.¹ Currently, our program
needs some Windows software in order to be built (e.g., Visual Studio Build
Tools), so Docker needs to be configured to use Windows images. The other
prerequisite is Visual Studio 2019² with the Universal Windows Platform
development workload and C++/WinRT installed.


Downloading dependencies Because of space limitations, we do not in- 
clude sources of third-party dependencies in the attachment. Hence, they 
have to be downloaded and patched manually before the project can be built. 
Details are documented in file src/deps/README.md. Alternatively, we can 
install CMake and Ninja and run the following commands to download and 
patch the dependencies automatically: 


cd src/deps
mkdir build && cd build
cmake -G Ninja ..
cmake --build .


¹Docker Desktop for Windows can be found at https://hub.docker.com/editions/community/docker-ce-desktop-windows

²Some older versions of Visual Studio should also work.


Building Docker container Inside the development environment, a few
programs are installed to support the build. The main one is the build
system. To make the development easier, we use only one main build system,
CMake with Ninja. Originally, the dependencies sometimes used different
build systems, and building the whole program was not easy because the build
systems did not cooperate. Some of the dependencies were already prepared
to use our build system (e.g., Clang); others needed to be patched for it
(e.g., WinObjC).

To build most of our emulator (i.e., everything except the final 
application), including the development environment, we run the following 
commands inside folder src (i.e., the folder containing Dockerfile among 
other files): 


docker-compose build 
docker-compose run --name ipasim ipasim powershell 


Building the project The commands above create a Docker container 
with our development environment. Once inside the container, we run the 
following command to trigger the build. It invokes CMake and Ninja under 
the hood. 


./scripts/build.ps1


To actually get build outputs out of the container, we run the following 
command. It copies the outputs into the source tree, to a directory called 
cmake. 


./scripts/extract.ps1 Release


Finally, to build IpaSimApp, we open solution
src/src/IpaSimulator/IpaSimApp.sln in Visual Studio and start it: we first
switch to configuration Release³ and then select “Start Without Debugging”
in menu “Debug”. The build automatically picks up the build artifacts
generated in the steps described above.


³Building configuration Debug is also possible, but the build artifacts must
then be copied using command ./scripts/extract.ps1 Debug.
Running the emulator When running the Visual Studio solution, the
emulator launches automatically. To start the emulator later, we can find
it among installed programs in the Start menu of Windows under the name
“ipaSim”.

When running, the emulator displays two windows, as can be seen in
Figure 3.5. One window displays logging output and errors, if any, and the
other shows the emulated application.

The emulator requires an iPhone application as an input. At launch, it asks
for a folder with the application's contents. This must be the folder with
suffix .app (e.g., MyApp.app); see Section 1.1.2 for more details.


Getting samples Sample inputs for the emulator are contained in directory
src/samples. There are three sample iOS applications: HelloWorld,
SampleApp (the one shown in Figure 3.5), and IpasimBenchmark.
These samples are written specifically to be simple enough for our emulator
to handle them correctly. Unfortunately, more complex iOS applications
cannot be executed by our emulator because a lot of functionality is still
unimplemented (both in WinObjC translations and in some of our ports).

The samples can be built in Xcode on a Mac machine. Contents of an IPA
package that our emulator takes as an input can be produced from Xcode’s 
menu “Product” by selecting action “Archive”. 
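
An IPA package is a ZIP archive whose Payload directory holds the .app folder, so the folder that the emulator expects can also be obtained from an existing package. The following Python sketch shows one way to do that; the function name extract_app_folder is hypothetical and the sketch is not part of the attachment.

```python
import zipfile
from pathlib import Path


def extract_app_folder(ipa_path, target_dir):
    """Unpack an .ipa archive and return the path of the contained .app folder."""
    target = Path(target_dir)
    with zipfile.ZipFile(ipa_path) as archive:
        archive.extractall(target)
    # By convention, the application bundle is stored as Payload/<Name>.app.
    apps = sorted(p for p in (target / "Payload").glob("*.app") if p.is_dir())
    if not apps:
        raise FileNotFoundError("no .app folder found in Payload")
    return apps[0]
```

For a hypothetical package MyApp.ipa that contains Payload/MyApp.app, calling extract_app_folder("MyApp.ipa", "unpacked") would return the path unpacked/Payload/MyApp.app, which is the folder to select when the emulator asks for its input.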


Since building the whole project takes a lot of time, a pre-built emulator
and sample inputs are available in folders build and samples, respectively.
Script build/Add-AppDevPackage.ps1 installs the emulator. It is built
for Windows 10, build 1809 or later.
B. Attachments


B.1 Source code and binaries 


Source code and pre-built binaries of the presented emulator and samples 
are available as an electronic attachment of this thesis. The attachment can 
be downloaded from Charles University Digital Repository [Cha17].

Source files of the emulator and samples are in folder src. Structure 
of this folder is documented in file src/README.md. Pre-built binary of the 
emulator is in folder build. Instructions for installation of the binary are 
provided in Appendix A.1. Pre-built samples that can be used as inputs for
the emulator are available in folder samples. 

This thesis in electronic form is available in file thesis.pdf. Source code 
from which that file was produced is in folder thesis. 